The MongoDB Node.js driver automatically splits a large bulk operation that exceeds the 16MB BSON limit into smaller batches to ensure successful execution
The MongoDB Node.js driver handles bulk operations that exceed the 16MB BSON document size limit by automatically splitting them into smaller, manageable batches. This is a built-in feature of the driver that shields developers from having to manage the 16MB limit manually. When you submit a large bulk operation, the driver calculates the size of each operation and groups the operations into batches that stay within both the 16MB BSON size limit and the server's maxWriteBatchSize limit (100,000 operations). These batches are then sent sequentially to the server, and the driver reassembles the results before returning a unified response.
The driver implements a batching mechanism that evaluates each operation's BSON size before including it in a batch. The driver uses bson.calculateObjectSize() to determine the exact size of each operation. When adding operations to a batch, the driver checks two thresholds: the per-batch operation count (1,000 on older servers; 100,000 since MongoDB 3.6) and the cumulative BSON size (approaching 16MB). If adding another operation would exceed either threshold, the driver seals the current batch, sends it to the server, and begins a new batch for the remaining operations. This process continues until all operations have been executed.
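The batching loop described above can be sketched as a small, self-contained function. This is an illustrative simplification, not the driver's actual internals; splitIntoBatches and sizeOf are names invented for this sketch.

```javascript
// Sketch of the driver's batch-splitting logic (illustrative names, not
// the driver's real internals). Operations are grouped so that no batch
// exceeds either the operation-count limit or the byte-size limit.
const MAX_BATCH_OPS = 100000;             // server maxWriteBatchSize (3.6+)
const MAX_BATCH_BYTES = 16 * 1024 * 1024; // 16MB BSON limit

function splitIntoBatches(ops, sizeOf, maxOps = MAX_BATCH_OPS, maxBytes = MAX_BATCH_BYTES) {
  const batches = [];
  let current = [];
  let currentBytes = 0;

  for (const op of ops) {
    const opBytes = sizeOf(op);
    if (opBytes > maxBytes) {
      // A single operation is never split across batches; if it alone
      // exceeds the limit, it simply cannot be sent.
      throw new Error(`operation of ${opBytes} bytes exceeds the ${maxBytes} byte limit`);
    }
    // Seal the current batch if adding this op would breach either threshold.
    if (current.length >= maxOps || currentBytes + opBytes > maxBytes) {
      batches.push(current);
      current = [];
      currentBytes = 0;
    }
    current.push(op);
    currentBytes += opBytes;
  }
  if (current.length > 0) batches.push(current);
  return batches;
}
```

For example, three operations of 7MB each would be split into two batches: the first two fit within 16MB together, while the third forces a new batch.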
The driver's size calculation is precise. When building a batch, it tracks both the currentBatchSize (number of operations) and currentBatchSizeBytes (total BSON size). For each new operation, it calculates maxKeySize + bsonSize and checks whether this would push the batch over the limit. The driver specifically sets ignoreUndefined: false during BSON size calculation to ensure accurate sizing, even when documents contain undefined values that might otherwise be omitted from size estimates. This prevents the under-counting that could otherwise cause batches to exceed the 16MB limit at the server level.
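The under-counting risk can be illustrated without the BSON library: JSON.stringify drops undefined-valued keys, much as BSON serialization with ignoreUndefined: true would, so a size estimate taken from the stripped form is smaller than what gets sent if the field is actually serialized. This is a stand-in analogy, not the driver's real sizing code.

```javascript
// Why undefined handling matters when sizing documents.
// JSON.stringify silently drops undefined-valued keys; an estimate based
// on the stripped form under-counts a payload where the field survives
// (serialized here as null purely for illustration).
const doc = { _id: 1, name: "a", note: undefined };

// Estimate that drops the undefined field: '{"_id":1,"name":"a"}'
const strippedBytes = Buffer.byteLength(JSON.stringify(doc));

// Estimate that keeps the field: '{"_id":1,"name":"a","note":null}'
const keptBytes = Buffer.byteLength(
  JSON.stringify(doc, (k, v) => (v === undefined ? null : v))
);

console.log(strippedBytes < keptBytes); // true: the stripped estimate is smaller
```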
This automatic splitting behavior was not always present in all MongoDB tools. In older versions of the mongo shell (prior to 3.2), bulk operations that exceeded 16MB would simply fail with an error like 'Object size 16795903 exceeds limit of 16793600 bytes'. This was fixed in a Jira ticket (SERVER-23107) that improved the shell to split by size, bringing it in line with the behavior of the Python and C drivers. The Node.js driver has long included this batching behavior, making it robust for large-scale data operations, and modern versions of the driver (3.6+) continue to handle it seamlessly.
16MB BSON Limit: This is the maximum size of a single BSON document. The driver ensures no batch payload exceeds this.
Max Write Batch Size: The server limits each batch to 100,000 operations (as of MongoDB 3.6+). The driver respects this and splits by operation count when the byte size is not the binding constraint.
No Operation Splitting: Individual operations are never split across batches. If a single operation itself exceeds 16MB, the driver throws an error immediately, because such an operation cannot be sent.
Ordered vs Unordered: The driver preserves the ordered/unordered setting when splitting batches. For ordered operations, batches are sent strictly in sequence; for unordered operations, the server is free to execute operations in any order, permitting more parallelism.
Error Handling: If a batch fails, the driver reports errors appropriately. For ordered operations, it stops after the failing batch; for unordered, it continues with the remaining batches.
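The ordered-versus-unordered error semantics above can be captured in a tiny simulation. This is purely illustrative; runBatches and executeBatch are invented names, with executeBatch standing in for a round trip to the server.

```javascript
// Simplified simulation of ordered vs unordered batch execution semantics.
// executeBatch is a stand-in for sending one batch to the server; it either
// returns a result or throws.
function runBatches(batches, executeBatch, ordered) {
  const results = [];
  const errors = [];
  for (const batch of batches) {
    try {
      results.push(executeBatch(batch));
    } catch (err) {
      errors.push(err);
      if (ordered) break; // ordered: stop at the first failing batch
      // unordered: keep going with the remaining batches
    }
  }
  return { results, errors };
}
```

With three batches where the second fails, an ordered run yields one result and one error, while an unordered run yields two results and one error.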
While the driver handles splitting automatically, you can optimize performance by being mindful of your operation sizes. For documents approaching the 16MB limit, even a single operation may require its own batch, which can hurt throughput. For optimal performance with large datasets, consider batching 1,000-5,000 operations per bulk write call, as this keeps per-call overhead low without building enormous in-memory arrays. Also remember that unordered operations (ordered: false) allow MongoDB to execute operations in any order, which can significantly improve throughput for independent operations.
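One simple way to keep each bulk write call in that range is to chunk the operations yourself. The chunk helper below is a generic sketch; the commented usage assumes a collection handle and an ops array exist in your application.

```javascript
// Split an array of write operations into fixed-size chunks so each
// collection.bulkWrite() call stays in the suggested 1,000-5,000 op range.
function chunk(ops, size = 1000) {
  const out = [];
  for (let i = 0; i < ops.length; i += size) {
    out.push(ops.slice(i, i + size));
  }
  return out;
}

// Hypothetical usage with the driver (collection and ops assumed to exist):
// for (const batch of chunk(ops, 1000)) {
//   await collection.bulkWrite(batch, { ordered: false });
// }
```

Chunking at the application level also bounds memory use, since you never hold one giant operations array alongside the driver's internal batches.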